fix: prevent spurious response headers when streaming thinking-model tokens by fubak · Pull Request #8 · GetSmallAI/SmallHarness

fubak · 2026-06-14T21:51:42Z

Problem

When using a thinking/reasoning model (qwen3, deepseek-r1, etc.) via Ollama, the interactive TUI renders a new response ── header for every individual reasoning token — producing hundreds of blank panels and making the session unusable.

Root cause: Two independent bugs that compound each other:

agent.rs — The streaming loop emits AgentEvent::Text for every content delta, including the empty strings ("") that Ollama sends alongside each reasoning token during the thinking phase. render_text("") opens a fresh answer panel (printing a new header) even though there is nothing to render.
renderer.rs — The Reasoning event handler calls end_answer() unconditionally, closing whichever answer panel was just opened. This leaves answer_wrap = None, so the next empty Text delta opens yet another panel. With hundreds of thinking tokens, this cycle produces hundreds of spurious headers — even when display.reasoning is false.

Fix

Three minimal changes:

agent.rs (1 line): Skip AgentEvent::Text emission when the content delta is empty.
renderer.rs (3 lines): Only call end_answer() in the Reasoning handler when display.reasoning is true; when reasoning display is off, the answer panel should be unaffected by reasoning tokens.
renderer.rs (3 lines): Guard render_text() with an early return on empty delta as defence-in-depth against future empty deltas from any backend.

Tests

Three unit tests added to renderer.rs:

empty_text_delta_does_not_open_answer_block — empty Text event leaves answer_wrap as None
reasoning_event_preserves_answer_block_when_reasoning_display_off — Reasoning event does not close an in-progress answer when reasoning: false
reasoning_event_closes_answer_block_when_reasoning_display_on — Reasoning event correctly closes the answer block when reasoning: true (existing intended behaviour)

All 383 existing tests continue to pass.

Verified with

qwen3:8b via Ollama 0.30.8 on an RTX 2080 Super (SM 75)
qwen2.5:7b via Ollama 0.30.8 (baseline — no thinking tokens, unaffected)

When a thinking model (qwen3, deepseek-r1) is used via Ollama, the OpenAI-compat streaming API sends per-token chunks that may include an empty content string alongside each reasoning token. Before this change two bugs combined to create a 'new response header per reasoning word' effect in the TUI: 1. agent.rs emitted AgentEvent::Text for every content delta, including empty strings. render_text('') would open a fresh answer panel (printing a new 'response ──' header) even though there was nothing to show. 2. renderer.rs called end_answer() unconditionally on every Reasoning event. This closed the just-opened (empty) answer panel, leaving answer_wrap = None so the next empty Text delta would open yet another panel. With hundreds of thinking tokens this produced hundreds of spurious headers even with reasoning display turned off. Fixes: - agent.rs: skip AgentEvent::Text emission when content delta is empty. - renderer.rs: only call end_answer() in the Reasoning handler when reasoning display is on (otherwise the panel state is unaffected). - renderer.rs: guard render_text() with an early return on empty delta as a belt-and-suspenders safety against future empty deltas. Tested with qwen3:8b and qwen2.5:7b via Ollama 0.30.8.

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adjusts rendering/event emission to handle “thinking model” streams that include empty content deltas alongside reasoning, preventing per-token answer block resets and header spam.

Changes:

Only close the current answer block on Reasoning events when reasoning display is enabled.
Ignore empty text deltas in the renderer and avoid emitting empty Text events in the agent.
Add targeted regression tests covering empty text deltas and reasoning toggle behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
src/renderer.rs	Prevents empty text deltas and (when disabled) reasoning tokens from disrupting the answer block; adds regression tests.
src/agent.rs	Stops emitting `AgentEvent::Text` for empty `content` deltas to reduce downstream rendering churn.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+                if self.display.reasoning {
+                    self.end_answer();
+                }
                self.render_reasoning(&delta)
            }


bennewell35

I reviewed the diff and ran the upstream-style checks in a disposable Docker Rust environment. The behavior tests pass, but the PR currently fails the required clippy gate.

Validation commands:

docker run --rm -v /tmp/smallharness-pr8:/work -w /work rust:latest bash -c 'rustup component add rustfmt clippy >/dev/null && rustc --version && cargo --version && cargo fmt --all -- --check && cargo clippy --all-targets -- -D warnings && cargo test'\n```\n\nResult:\n- `rustc 1.96.0` / `cargo 1.96.0`\n- `cargo fmt --all -- --check` passed\n- `cargo clippy --all-targets -- -D warnings` failed\n- Separate `cargo test` passed: 383 passed, 0 failed\n\nClippy failure:\n\n```text\nerror: field assignment outside of initializer for an instance created with Default::default()\n   --> src/renderer.rs:820:9\n    |\n820 |         cfg.reasoning = false;\n\nerror: field assignment outside of initializer for an instance created with Default::default()\n   --> src/renderer.rs:826:9\n    |\n826 |         cfg.reasoning = true;\n```\n\nMinimal fix should be to initialize those helpers directly, e.g.\n\n```rust\nfn renderer_reasoning_off() -> TuiRenderer {\n    TuiRenderer::new(DisplayConfig {\n        reasoning: false,\n        ..Default::default()\n    })\n}\n\nfn renderer_reasoning_on() -> TuiRenderer {\n    TuiRenderer::new(DisplayConfig {\n        reasoning: true,\n        ..Default::default()\n    })\n}\n```\n\nI did not live-test qwen3/deepseek-r1 via Ollama, so this is code/test validation rather than independent model-runtime validation. The patch looks narrow and the added renderer tests cover the stated state transitions, but it needs the clippy fix before it can pass the documented project gate.

…ests Initialize DisplayConfig with a struct literal + ..Default::default() instead of mutating fields after default(), so the repo's `cargo clippy --all-targets -- -D warnings` gate passes in CI. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings June 14, 2026 21:51

Copilot AI reviewed Jun 14, 2026

View reviewed changes

Comment thread src/renderer.rs

Comment on lines +378 to 382

if self.display.reasoning {

self.end_answer();

}

self.render_reasoning(&delta)

}

bennewell35 reviewed Jun 26, 2026

View reviewed changes

morganlinton merged commit d512fc5 into GetSmallAI:main Jun 29, 2026
4 of 5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

fix: prevent spurious response headers when streaming thinking-model tokens#8

fix: prevent spurious response headers when streaming thinking-model tokens#8
morganlinton merged 2 commits into
GetSmallAI:mainfrom
fubak:fix/thinking-model-tui-rendering

fubak commented Jun 14, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

bennewell35 left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Uh oh!

Conversation

fubak commented Jun 14, 2026

Problem

Fix

Tests

Verified with

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

bennewell35 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants